Clustering Technology of a Data Engine for Analytical Computing
Authors
Abstract
Contemporary scientific studies frequently rely on data-intensive analytical computing. While the main goal of this emerging form of computing is to facilitate hypothesis formulation or to test the validity of a postulated model, its primary method is usually data clustering. Since typical analytical tasks operate on very large volumes of potentially high-dimensional data, scientific studies also face enormous problems of scale. This paper describes the clustering technology of a new engine for data-intensive analytical computing. The technology is designed to operate in high-dimensional feature spaces without requiring dimensionality reduction. This enables the data engine to achieve high degrees of scalability and high interoperability between the analytical tasks. Most processes supported by the engine operate on a shared aggregate representation of data in the original feature space.
Similar Resources
Assessment Methodology for Anomaly-Based Intrusion Detection in Cloud Computing
Cloud computing has become an attractive target for attackers because mainstream cloud technologies, such as virtualization and multitenancy, permit multiple users to utilize the same physical resource, thereby posing the so-called problem of internal facing security. Moreover, traditional network-based intrusion detection systems (IDSs) are ineffective to be deployed in the cloud...
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Application of Soft Computing Methods for the Estimation of Roadheader Performance from Schmidt Hammer Rebound Values
Estimation of roadheader performance is one of the main topics in determining the economics of underground excavation projects. Poor performance estimation of roadheaders can lead to costly contractual claims. In this paper, the application of soft computing methods for data analysis called adaptive neuro-fuzzy inference system-subtractive clustering method (ANFIS-SCM) and artificial neu...
A Clustering Approach to Scientific Workflow Scheduling on the Cloud with Deadline and Cost Constraints
One of the main features of High Throughput Computing systems is the availability of high power processing resources. Cloud Computing systems can offer these features through concepts like Pay-Per-Use and Quality of Service (QoS) over the Internet. Many applications in Cloud computing are represented by workflows. Quality of Service is one of the most important challenges in the context of sche...
Utilization of Soft Computing for Evaluating the Performance of Stone Sawing Machines, Iranian Quarries
The escalating construction industry has led to a drastic increase in the dimension stone demand in the construction, mining and industry sectors. Assessment and investigation of mining projects and stone processing plants such as sawing machines is necessary to manage and respond to the sawing performance; hence, the soft computing techniques were considered as a challenging task due to stocha...